AITopics | regularized empirical risk minimization


regularized empirical risk minimization


Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

Neural Information Processing Systems | Nov-21-2025, 15:41:50 GMT

We develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches has become a gold standard in the machine learning community, because mini-batch techniques stabilize the gradient estimate and can easily make good use of parallel computing. The core of our proposed method is the combination of our new "double acceleration" technique with a variance reduction technique. We theoretically analyze the proposed method and show that it substantially improves the mini-batch efficiency of previous accelerated stochastic methods, and essentially needs mini-batches of size only $\sqrt{n}$ to achieve the optimal iteration complexities for both non-strongly and strongly convex objectives, where $n$ is the training set size. Further, we show that even in non-mini-batch settings, our method achieves the best known convergence rate for non-strongly convex and strongly convex objectives.
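As a rough illustration of the two ingredients the abstract names (mini-batches and variance reduction), here is a minimal sketch of plain mini-batch proximal SVRG on an L1-regularized least-squares problem. It is not the authors' doubly accelerated method: there is no momentum and no dual averaging. The loss, the function names, the step size, and the synthetic data are all assumptions chosen for the example. The mini-batch size is set to the square root of the training set size only to echo the $\sqrt{n}$ figure in the abstract.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def grad_batch(w, X, y, idx):
    """Average gradient of 0.5 * (x.w - y)^2 over the rows listed in idx."""
    g = [0.0] * len(w)
    for i in idx:
        r = sum(xj * wj for xj, wj in zip(X[i], w)) - y[i]
        for j, xj in enumerate(X[i]):
            g[j] += r * xj / len(idx)
    return g

def objective(w, X, y, lam):
    """Regularized empirical risk: mean squared loss plus lam * ||w||_1."""
    n = len(X)
    loss = sum(0.5 * (sum(xj * wj for xj, wj in zip(X[i], w)) - y[i]) ** 2
               for i in range(n)) / n
    return loss + lam * sum(abs(wj) for wj in w)

def minibatch_prox_svrg(X, y, lam, step, epochs, batch_size, seed=0):
    """Plain (non-accelerated) mini-batch proximal SVRG; illustrative only."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        snapshot = w[:]
        full_grad = grad_batch(snapshot, X, y, range(n))  # one full pass per epoch
        for _ in range(max(1, n // batch_size)):
            idx = rng.sample(range(n), batch_size)
            g_w = grad_batch(w, X, y, idx)
            g_s = grad_batch(snapshot, X, y, idx)
            # Variance-reduced estimate: batch gradient corrected by the snapshot.
            w = [soft_threshold(wj - step * (gw - gs + fg), step * lam)
                 for wj, gw, gs, fg in zip(w, g_w, g_s, full_grad)]
    return w

# Synthetic data (assumed): n = 64 samples, so sqrt(n) = 8 is the batch size.
rng = random.Random(1)
w_true = [1.0, -2.0, 0.0]
X = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(64)]
y = [sum(a * b for a, b in zip(row, w_true)) for row in X]
w_hat = minibatch_prox_svrg(X, y, lam=0.01, step=0.1, epochs=100, batch_size=8)
```

The snapshot correction makes the variance of the gradient estimate shrink as the iterate approaches the snapshot, so a constant step size can be used.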

doubly accelerated stochastic variance reduced, regularized empirical risk minimization, variance reduced dual averaging method, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


An Accelerated Proximal Coordinate Gradient Method

Qihang Lin, Zhaosong Lu, Lin Xiao

Neural Information Processing Systems | Feb-9-2025, 13:07:49 GMT

We develop an accelerated randomized proximal coordinate gradient (APCG) method for solving a broad class of composite convex optimization problems. In particular, our method achieves faster linear convergence rates for minimizing strongly convex functions than existing randomized proximal coordinate gradient methods. We show how to apply the APCG method to solve the dual of the regularized empirical risk minimization (ERM) problem, and devise efficient implementations that avoid full-dimensional vector operations. For ill-conditioned ERM problems, our method obtains better convergence rates than the state-of-the-art stochastic dual coordinate ascent (SDCA) method.
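For readers unfamiliar with the baseline that APCG accelerates, the following is a minimal sketch of an ordinary (non-accelerated) randomized proximal coordinate gradient step, applied to a lasso problem in the primal. It is not the APCG method and it does not solve the ERM dual. The objective (1/(2n))·||Ax − b||² + lam·||x||₁, the function names, and the tiny test problem are assumptions made for the example.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar."""
    return max(v - t, 0.0) - max(-v - t, 0.0)

def prox_coordinate_descent(A, b, lam, iters, seed=0):
    """Randomized proximal coordinate gradient for (1/(2n))||Ax-b||^2 + lam*||x||_1."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    residual = [-bi for bi in b]  # equals A x - b at x = 0
    # Coordinate-wise Lipschitz constants of the smooth part.
    lips = [sum(A[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(iters):
        j = rng.randrange(d)  # sample one coordinate uniformly
        if lips[j] == 0.0:
            continue
        grad_j = sum(A[i][j] * residual[i] for i in range(n)) / n
        new_xj = soft_threshold(x[j] - grad_j / lips[j], lam / lips[j])
        delta = new_xj - x[j]
        if delta != 0.0:
            # Keep the residual in sync so each step touches one column only.
            for i in range(n):
                residual[i] += A[i][j] * delta
            x[j] = new_xj
    return x

# Separable toy problem (assumed): with A = I the minimizer is [2.0, 0.0].
x_hat = prox_coordinate_descent([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.1], lam=0.5, iters=20)
```

Each iteration updates a single coordinate and only reads one column of A, which is what makes coordinate methods attractive when full gradient evaluations are expensive.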

artificial intelligence, descent method, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > United States > Iowa > Johnson County > Iowa City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Reviews: Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

Neural Information Processing Systems | Oct-8-2024, 01:02:31 GMT

The paper proposes a novel doubly accelerated variance reduced dual averaging method for solving the convex regularized empirical risk minimization problem in mini-batch settings. The method can essentially be interpreted as replacing the proximal gradient update of the APG method with an inner SVRG loop and then introducing momentum updates in the inner SVRG loops. Finally, to allow lazy updates, primal SVRG is replaced with variance reduced dual averaging. The main difference from AccProxSVRG is the introduction of a momentum term at the outer iteration level as well. The method requires only $O(\sqrt{n})$-sized mini-batches to achieve optimal iteration complexities for both non-strongly convex and strongly convex objectives when the problem is badly conditioned or requires high accuracy.

artificial intelligence, doubly accelerated stochastic variance reduced, regularized empirical risk minimization, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.39)


An Accelerated Proximal Coordinate Gradient Method

Neural Information Processing Systems | Mar-13-2024, 10:47:21 GMT

apcg method, coordinate descent method, descent method, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > United States > Iowa > Johnson County > Iowa City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

Murata, Tomoya, Suzuki, Taiji

Neural Information Processing Systems | Feb-14-2020, 05:57:23 GMT

We develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches has become a gold standard in the machine learning community, because mini-batch techniques stabilize the gradient estimate and can easily make good use of parallel computing. The core of our proposed method is the combination of our new "double acceleration" technique with a variance reduction technique. We theoretically analyze the proposed method and show that it substantially improves the mini-batch efficiency of previous accelerated stochastic methods, and essentially needs mini-batches of size only $\sqrt{n}$ to achieve the optimal iteration complexities for both non-strongly and strongly convex objectives, where $n$ is the training set size. Further, we show that even in non-mini-batch settings, our method achieves the best known convergence rate for non-strongly convex and strongly convex objectives.

doubly accelerated stochastic variance reduced, regularized empirical risk minimization, variance reduced dual averaging method, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Best-scored Random Forest Classification

Hang, Hanyuan, Liu, Xiaoyu, Steinwart, Ingo

arXiv.org Machine Learning | May-27-2019

We propose an algorithm named best-scored random forest for binary classification problems. The term "best-scored" means that each tree in the forest is selected as the one with the best empirical performance among a certain number of purely random tree candidates. In this way, the resulting forest can be more accurate than the original purely random forest. From the theoretical perspective, within the framework of regularized empirical risk minimization penalized on the number of splits, we establish almost optimal convergence rates for the proposed best-scored random trees under certain conditions, which can be extended to the best-scored random forest. In addition, we present a counterexample to illustrate that, in order to ensure the consistency of the forest, every dimension must have the chance to be split. In the numerical experiments, for the sake of efficiency, we employ an adaptive random splitting criterion. Comparative experiments with other state-of-the-art classification methods demonstrate the accuracy of our best-scored random forest.
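The selection rule described in the abstract can be sketched roughly as follows. This is a simplified reading, not the authors' algorithm: trees are grown to a fixed depth, candidates are scored by plain training accuracy, and neither the penalty on the number of splits nor the adaptive splitting criterion is reproduced. All function names, the tuple-based tree layout, and the synthetic data are assumptions made for the example.

```python
import random

def grow_random_tree(X, y, box, depth, rng):
    """Purely random tree: split dimension and cut point ignore the labels."""
    if depth == 0 or not X:
        return ("leaf", 1 if 2 * sum(y) > len(y) else 0)  # majority label
    dim = rng.randrange(len(box))      # every dimension can be chosen
    lo, hi = box[dim]
    cut = rng.uniform(lo, hi)          # uniform cut inside the current cell
    left = [i for i, x in enumerate(X) if x[dim] <= cut]
    right = [i for i, x in enumerate(X) if x[dim] > cut]
    left_box = box[:dim] + [(lo, cut)] + box[dim + 1:]
    right_box = box[:dim] + [(cut, hi)] + box[dim + 1:]
    return ("node", dim, cut,
            grow_random_tree([X[i] for i in left], [y[i] for i in left],
                             left_box, depth - 1, rng),
            grow_random_tree([X[i] for i in right], [y[i] for i in right],
                             right_box, depth - 1, rng))

def predict_tree(tree, x):
    while tree[0] == "node":
        _, dim, cut, left, right = tree
        tree = left if x[dim] <= cut else right
    return tree[1]

def accuracy(tree, X, y):
    return sum(predict_tree(tree, x) == t for x, t in zip(X, y)) / len(y)

def best_scored_tree(X, y, box, depth, n_candidates, rng):
    """Grow several random candidates and keep the best-scoring one."""
    candidates = [grow_random_tree(X, y, box, depth, rng) for _ in range(n_candidates)]
    return max(candidates, key=lambda t: accuracy(t, X, y))

def best_scored_forest(X, y, box, depth, n_candidates, n_trees, seed=0):
    rng = random.Random(seed)
    return [best_scored_tree(X, y, box, depth, n_candidates, rng) for _ in range(n_trees)]

def predict_forest(forest, x):
    votes = sum(predict_tree(t, x) for t in forest)
    return 1 if 2 * votes > len(forest) else 0  # majority vote

# Synthetic data (assumed): points in the unit square, label 1 when x0 > 0.5.
data_rng = random.Random(0)
X = [[data_rng.random(), data_rng.random()] for _ in range(100)]
y = [1 if x[0] > 0.5 else 0 for x in X]
box = [(0.0, 1.0), (0.0, 1.0)]
forest = best_scored_forest(X, y, box, depth=3, n_candidates=10, n_trees=5)
```

Because the splits never look at the labels, a single random tree can be poor; picking the best of several candidates is the mechanism by which the forest is meant to improve on a purely random one.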

artificial intelligence, machine learning, random forest, (16 more...)

arXiv.org Machine Learning

1905.11028

Country:

North America > United States > New York (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > California (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)


Classifying Big Data over Networks via the Logistic Network Lasso

Ambos, Henrik, Tran, Nguyen, Jung, Alexander

arXiv.org Machine Learning | May-7-2018

Abstract: We apply network Lasso to solve binary classification (clustering) problems on network-structured data. To this end, we generalize ordinary logistic regression to non-Euclidean data defined over a complex network structure. A scalable classification algorithm is obtained by applying the alternating direction method of multipliers to solve this optimization problem.

Index Terms: compressed sensing, big data over networks, semi-supervised learning, classification, clustering, complex networks, convex optimization

I. Introduction: We consider the problem of classifying or clustering a large set of data points which conform to an underlying network structure. Such network-structured datasets arise in a wide range of application domains, e.g., image and video processing as well as social networks [1].
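To make the kind of optimization problem concrete, here is a small sketch that evaluates a generic network-Lasso-style objective with a logistic loss: an average logistic loss over the labeled nodes plus a weighted sum of Euclidean distances between the weight vectors of neighboring nodes. This form is an assumption based on the standard network Lasso; the paper's exact formulation may differ, and the ADMM solver mentioned in the abstract is not shown. The function names and data layout are hypothetical.

```python
import math

def logistic_loss(margin):
    """log(1 + exp(-margin)), written to avoid overflow for large |margin|."""
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def network_lasso_objective(weights, features, labels, edges, lam):
    """Evaluate an assumed logistic network-Lasso objective.

    weights[i]  : weight vector attached to node i
    features[i] : feature vector of node i
    labels      : dict mapping labeled node -> +1 or -1
    edges       : list of (i, j, a_ij) with edge weight a_ij >= 0
    lam         : regularization strength
    """
    loss = 0.0
    for i, yi in labels.items():
        score = sum(w * x for w, x in zip(weights[i], features[i]))
        loss += logistic_loss(yi * score)
    loss /= len(labels)
    # Total-variation-style penalty: neighbors are pushed toward equal weights.
    tv = sum(a * math.sqrt(sum((wi - wj) ** 2
                               for wi, wj in zip(weights[i], weights[j])))
             for i, j, a in edges)
    return loss + lam * tv

# Two-node toy graph (assumed): only node 0 is labeled.
value = network_lasso_objective(
    weights=[[1.0], [0.0]],
    features=[[1.0], [1.0]],
    labels={0: 1},
    edges=[(0, 1, 2.0)],
    lam=0.5,
)
```

The penalty term is what couples the nodes: unlabeled nodes receive information only through the edges that connect them to labeled ones.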

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

1805.02483

Country:

North America > United States > Massachusetts > Plymouth County > Hanover (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Finland (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.61)


An Accelerated Proximal Coordinate Gradient Method

Lin, Qihang, Lu, Zhaosong, Xiao, Lin

Neural Information Processing Systems | Dec-31-2014

We develop an accelerated randomized proximal coordinate gradient (APCG) method for solving a broad class of composite convex optimization problems. In particular, our method achieves faster linear convergence rates for minimizing strongly convex functions than existing randomized proximal coordinate gradient methods. We show how to apply the APCG method to solve the dual of the regularized empirical risk minimization (ERM) problem, and devise efficient implementations that can avoid full-dimensional vector operations. For ill-conditioned ERM problems, our method obtains better convergence rates than the state-of-the-art stochastic dual coordinate ascent (SDCA) method.

artificial intelligence, descent method, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > United States > Iowa > Johnson County > Iowa City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
